arXiv:1509.07179v1 [cs.LG] 23 Sep 2015




IllinoisSL: A JAVA Library for Structured Prediction 

Kai-Wei Chang* kw@kwchang.net 

Microsoft Research New England, MA 

Shyam Upadhyay upadhya3@illinois.edu 

Department of Computer Science, University of Illinois at Urbana-Champaign, IL 

Ming-Wei Chang minchang@microsoft.com 

Microsoft Research, Redmond, WA 

Vivek Srikumar* svivek@cs.utah.edu 

School of Computing at the University of Utah, UT 

Dan Roth danr@illinois.edu 

Department of Computer Science, University of Illinois at Urbana-Champaign, IL 


Editor: 


Abstract 

IllinoisSL is a Java library for learning structured prediction models. It supports
structured Support Vector Machines and structured Perceptron. The library consists of a core
learning module and several applications, which can be executed from the command line.
Documentation is provided to guide users. In comparison to other structured learning
libraries, IllinoisSL is efficient, general, and easy to use.


1. Introduction 

Structured prediction models have been widely used in several fields, including natural
language processing, computer vision, and bioinformatics. To make structured prediction
more accessible to practitioners, we present IllinoisSL, a Java library for implementing
structured prediction models. Our library supports fast parallelizable variants of commonly
used models like Structured Support Vector Machines (SSVM) and Structured Perceptron
(SP), allowing users to use multiple cores to train models more efficiently. Experiments
on part-of-speech (POS) tagging show that models implemented in IllinoisSL achieve the
same level of performance as a well-known C++ implementation of Structured
SVM, in one-sixth of its training time. To the best of our knowledge, IllinoisSL is the
first fully self-contained structured learning library in Java. The library is released under
the NCSA license (footnote 1), providing freedom for using and modifying the software.


*. Most of this work was done while the author was at the University of Illinois, supported by DARPA,
under agreement number FA8750-13-2-0008. The U.S. Government is authorized to reproduce
and distribute reprints for Governmental purposes notwithstanding any copyright notation thereon. 
The views and conclusions contained herein are those of the authors and should not be interpreted as 
necessarily representing the official policies or endorsements, either expressed or implied, of DARPA or 
the U.S. Government. 

1. http://opensource.org/licenses/NCSA
Task | x | y | InfSolver | FeatureGenerator
POS Tagging | sentence | tag sequence | Viterbi | Emission and Transition Features
Dependency Parsing | sentence | dependency tree | Chu-Liu-Edmonds | Edge features
Cost-Sensitive Multiclass | document | document category | argmax | document features


Table 1: Examples of applications implemented in the library. 


IllinoisSL provides a generic interface for building algorithms to learn from data.
A developer only needs to define the input and the output structures, and specify the
underlying model and inference algorithm (see Sec. 3). Then, the parameters of the model
can be estimated by the learning algorithms provided by the library. The generality of our
interface allows users to switch seamlessly between several learning algorithms.

The library and documentation are available at
http://cogcomp.cs.illinois.edu/page/software_view/illinois-sl.


2. Structured Prediction Models 


This section introduces the notation and briefly describes the learning algorithms. We are
given a set of training data $\mathcal{D} = \{(x_i, y_i)\}_{i=1}^{l}$, where instances
$x_i \in \mathcal{X}$ are annotated with structured outputs $y_i \in \mathcal{Y}_i$, and
$\mathcal{Y}_i$ is a set of feasible structures for the $i$-th instance.


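Structured SVM (Taskar et al., 2004; Tsochantaridis et al., 2005) learns a weight vector
$w \in \mathbb{R}^n$ by solving the following optimization problem:

$$\min_{w,\xi} \; \frac{1}{2} w^T w + C \sum_i \ell(\xi_i) \quad \text{s.t.} \quad w^T \Phi(x_i, y_i) - w^T \Phi(x_i, y) \ge \Delta(y, y_i) - \xi_i, \;\; \forall i,\; y \in \mathcal{Y}_i. \qquad (1)$$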


where $\Phi(x, y)$ is a feature vector extracted from both input $x$ and output $y$. The constraints
in (1) force the model to assign a higher score to the correct output structure $y_i$ than to others.
$\xi_i$ is a slack variable, and we use a loss $\ell(\xi_i)$ on the slack to penalize the violation
in the objective function (1). IllinoisSL supports two algorithms to solve (1): a dual coordinate
descent method (DCD) (Chang et al., 2010; Chang and Yih, 2013) and a parallel DCD algorithm,
DEMI-DCD (Chang et al., 2013).
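To make the role of the slack variable concrete, the following Java sketch computes, for a single training example, the smallest slack that satisfies every constraint of (1): the maximum over candidate outputs of the loss plus the score of the candidate minus the score of the gold output. It assumes the candidate outputs can be enumerated and that their scores and losses are already available. The class and method names are illustrative and are not part of IllinoisSL.

```java
/** Illustrative helper (not part of IllinoisSL): slack of one example under the constraints in (1). */
final class SlackExample {

    /**
     * @param scores scores[y] = w^T Phi(x_i, y) for every candidate output y
     * @param losses losses[y] = Delta(y, y_i); expected to be 0 at y = gold
     * @param gold   index of the annotated output y_i
     * @return the smallest xi_i that satisfies every constraint for this example
     */
    static double slack(double[] scores, double[] losses, int gold) {
        double worst = 0.0; // the constraint for y = y_i is satisfied with xi_i = 0
        for (int y = 0; y < scores.length; y++) {
            // Amount by which the constraint for candidate y is violated when xi_i = 0.
            double violation = losses[y] + scores[y] - scores[gold];
            worst = Math.max(worst, violation);
        }
        return worst;
    }

    public static void main(String[] args) {
        double[] scores = {2.0, 1.5, -1.0}; // the gold output is index 0
        double[] losses = {0.0, 1.0, 1.0};
        // Candidate 1 falls inside the margin: 1.0 + 1.5 - 2.0 = 0.5
        System.out.println(slack(scores, losses, 0)); // prints 0.5
    }
}
```

The maximization inside `slack` is exactly the loss-augmented inference problem; for structured outputs it is solved by a search procedure rather than by enumeration.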


IllinoisSL also provides an implementation of Structured Perceptron (Collins, 2002).
At each step, Structured Perceptron updates the model using one training instance $(x_i, y_i)$
by $\hat{y} \leftarrow \arg\max_{y \in \mathcal{Y}_i} w^T \Phi(x_i, y)$,
$w \leftarrow w + \eta\,(\Phi(x_i, y_i) - \Phi(x_i, \hat{y}))$, where $\eta$ is a learning
rate. Our implementation includes the averaging trick introduced in Daumé III (2006).
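The update rule can be sketched in a few lines of Java. The example below is a toy, not the library's implementation: it assumes a multiclass problem in which $\Phi(x, y)$ copies the input features into a block reserved for label $y$, inference enumerates the labels, and averaging is done with a plain running sum of the weight vector rather than the more efficient bookkeeping of the averaging trick. All names are hypothetical.

```java
/** Toy averaged structured perceptron; all names are illustrative, not IllinoisSL's API. */
final class AveragedPerceptronSketch {

    /** Phi(x, y): copies the input features into the block that belongs to label y. */
    static double[] phi(double[] x, int y, int numLabels) {
        double[] f = new double[x.length * numLabels];
        System.arraycopy(x, 0, f, y * x.length, x.length);
        return f;
    }

    static double dot(double[] a, double[] b) {
        double s = 0.0;
        for (int i = 0; i < a.length; i++) s += a[i] * b[i];
        return s;
    }

    /** Inference: argmax_y w^T Phi(x, y) by enumeration (ties go to the smallest label). */
    static int predict(double[] w, double[] x, int numLabels) {
        int best = 0;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int y = 0; y < numLabels; y++) {
            double s = dot(w, phi(x, y, numLabels));
            if (s > bestScore) { bestScore = s; best = y; }
        }
        return best;
    }

    /** Runs the perceptron update for the given number of epochs and returns the averaged weights. */
    static double[] train(double[][] xs, int[] ys, int numLabels, int epochs, double eta) {
        int dim = xs[0].length * numLabels;
        double[] w = new double[dim];
        double[] sum = new double[dim]; // running sum of w after every step
        long steps = 0;
        for (int e = 0; e < epochs; e++) {
            for (int i = 0; i < xs.length; i++) {
                int yHat = predict(w, xs[i], numLabels);
                if (yHat != ys[i]) {
                    // w <- w + eta * (Phi(x_i, y_i) - Phi(x_i, yHat))
                    double[] gold = phi(xs[i], ys[i], numLabels);
                    double[] pred = phi(xs[i], yHat, numLabels);
                    for (int j = 0; j < dim; j++) w[j] += eta * (gold[j] - pred[j]);
                }
                for (int j = 0; j < dim; j++) sum[j] += w[j];
                steps++;
            }
        }
        for (int j = 0; j < dim; j++) sum[j] /= steps;
        return sum;
    }

    public static void main(String[] args) {
        double[][] xs = {{1, 0}, {0, 1}};
        int[] ys = {1, 2};
        double[] w = train(xs, ys, 3, 5, 1.0);
        System.out.println(predict(w, xs[0], 3)); // prints 1
        System.out.println(predict(w, xs[1], 3)); // prints 2
    }
}
```

When the prediction is already correct the two feature vectors coincide and the update is zero, so the explicit `yHat != ys[i]` test only saves work.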


3. IllinoisSL Library 

We provide command-line tools to allow users to quickly learn a model for problems with 
common structures, such as linear-chain, ranking, or a dependency tree. 

The user can also implement a custom structured prediction model through the library 
interface. We describe how to do the latter below. 
(b) Dependency Parsing


Figure 1: Accuracy versus training time for two NLP tasks on PTB.


Library Interface. IllinoisSL requires users to implement the following classes: 

• IInstance: the input $x$ (e.g., a sentence in POS tagging).

• IStructure: the output structure $y$ (e.g., a tag sequence in POS tagging).

• AbstractFeatureGenerator: contains a function FeatureGenerator to extract features
$\Phi(x, y)$ from an example pair $(x, y)$.

• AbstractInfSolver: provides a method for solving inference (i.e., $\arg\max_{y} w^T \Phi(x_i, y)$)
and loss-augmented inference ($\arg\max_{y} w^T \Phi(x_i, y) + \Delta(y, y_i)$), and a method
for evaluating the loss $\Delta(y, y_i)$. For example, in POS tagging, this class will include
implementations of a Viterbi decoder and the Hamming loss, respectively.
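As an illustration of what an inference solver for POS tagging has to provide, the Java sketch below implements a first-order Viterbi decoder, its loss-augmented variant, and the Hamming loss. It is a standalone example and does not extend AbstractInfSolver, whose method signatures are not given here. The class name, the method names, and the representation of scores as emission and transition matrices are assumptions made for the example.

```java
import java.util.Arrays;

/** Illustrative inference routines for a linear-chain tagger; not IllinoisSL's actual classes. */
final class ChainInferenceSketch {

    /** Hamming loss Delta(y, gold): number of positions where the tags differ. */
    static int hammingLoss(int[] y, int[] gold) {
        int loss = 0;
        for (int t = 0; t < gold.length; t++) if (y[t] != gold[t]) loss++;
        return loss;
    }

    /**
     * Viterbi decoding. emit[t][k] scores tag k at position t; trans[j][k] scores tag j followed by tag k.
     * If gold is non-null, 1 is added to every tag that disagrees with gold, which turns the
     * search into loss-augmented inference under the Hamming loss.
     */
    static int[] viterbi(double[][] emit, double[][] trans, int[] gold) {
        int n = emit.length, k = trans.length;
        if (n == 0) return new int[0];
        double[][] best = new double[n][k]; // best[t][c]: best score of a prefix ending in tag c
        int[][] back = new int[n][k];       // back[t][c]: previous tag on that best prefix
        for (int t = 0; t < n; t++) {
            for (int cur = 0; cur < k; cur++) {
                double local = emit[t][cur] + ((gold != null && gold[t] != cur) ? 1.0 : 0.0);
                if (t == 0) { best[t][cur] = local; continue; }
                double bestPrev = Double.NEGATIVE_INFINITY;
                for (int prev = 0; prev < k; prev++) {
                    double s = best[t - 1][prev] + trans[prev][cur];
                    if (s > bestPrev) { bestPrev = s; back[t][cur] = prev; }
                }
                best[t][cur] = bestPrev + local;
            }
        }
        int[] y = new int[n];
        for (int cur = 1; cur < k; cur++) {
            if (best[n - 1][cur] > best[n - 1][y[n - 1]]) y[n - 1] = cur;
        }
        for (int t = n - 1; t > 0; t--) y[t - 1] = back[t][y[t]];
        return y;
    }

    public static void main(String[] args) {
        double[][] emit = {{2.0, 0.0}, {0.0, 0.5}, {3.0, 0.0}};
        double[][] trans = {{0.5, 0.0}, {0.0, 0.5}};
        int[] gold = {0, 0, 0};
        int[] predicted = viterbi(emit, trans, null);
        int[] mostViolating = viterbi(emit, trans, gold);
        System.out.println(Arrays.toString(predicted));       // prints [0, 0, 0]
        System.out.println(Arrays.toString(mostViolating));   // prints [0, 1, 0]
        System.out.println(hammingLoss(mostViolating, gold)); // prints 1
    }
}
```

Because the Hamming loss decomposes over positions, loss-augmented inference reuses the same dynamic program with modified local scores; a loss that does not decompose would need a different search.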

Once these classes are implemented, the user can seamlessly switch between different 
learning algorithms. 

Ready-To-Use Implementations. The IllinoisSL package contains implementations
of several common NLP tasks including a sequential tagger, a cost-sensitive multiclass
classifier, and an MST dependency parser. Table 1 shows the implementation details of
these learners. These implementations provide users with the ability to easily train a model
for common problems from the command line, and also serve as examples for using the
library. The README file provides the details of how to use the library.

Documentation. IllinoisSL comes with detailed documentation, including a Java API reference,
command-line usage, and a tutorial. The tutorial provides step-by-step instructions for
building a POS tagger in 350 lines of Java code. Users can post their comments and
questions about the package to illinois-ml-nlp-users@cs.uiuc.edu.

4. Comparison 

To show that IllinoisSL-based implementations of common NLP systems are on par with
other structured learning libraries, we compare IllinoisSL with SVM^struct (footnote 2) and Seqlearn (footnote 3)

2. http://www.cs.cornell.edu/people/tj/svm_light/svm_struct.html

3. https://github.com/larsmans/seqlearn 
on a part-of-speech (POS) tagging problem. We follow the settings in Chang et al. (2013)
and conduct experiments on the English Penn Treebank (PTB) (Marcus et al.).

SVM^struct solves an L1-loss structured SVM problem using a cutting-plane method (Joachims
et al., 2009). Seqlearn implements a structured Perceptron algorithm for the sequential
tagging problem. For IllinoisSL, we use 16 CPU cores to train the structured SVM
model. Default parameters are used for all models. Figure 1(a) shows the accuracy of
each model as a function of training time. Despite being a general-purpose package, IllinoisSL
is more efficient than the others.

We also implemented a minimum spanning tree based dependency parser using the
IllinoisSL API. The implementation was done in less than 1000 lines of code, with a few hours
of coding effort. Figure 1(b) shows the performance of our system in terms of head-word accuracy
(i.e., unlabeled attachment score). IllinoisSL is competitive with MSTParser, a popular
implementation of a dependency parser.
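For readers unfamiliar with the metric, unlabeled attachment score is the fraction of tokens whose predicted head matches the gold head. The Java sketch below is a minimal illustration and is not taken from IllinoisSL. It assumes heads are stored as token indices and that every token is counted, whereas evaluation scripts often apply extra conventions such as skipping punctuation.

```java
/** Illustrative unlabeled attachment score (UAS); not taken from IllinoisSL. */
final class UasSketch {

    /**
     * goldHeads[i] and predictedHeads[i] give the head index of token i.
     * Every token is counted; real evaluations often skip punctuation.
     */
    static double uas(int[] goldHeads, int[] predictedHeads) {
        if (goldHeads.length != predictedHeads.length) {
            throw new IllegalArgumentException("head arrays must have the same length");
        }
        if (goldHeads.length == 0) {
            throw new IllegalArgumentException("need at least one token");
        }
        int correct = 0;
        for (int i = 0; i < goldHeads.length; i++) {
            if (goldHeads[i] == predictedHeads[i]) correct++;
        }
        return (double) correct / goldHeads.length;
    }

    public static void main(String[] args) {
        // "She eats fish": tokens are numbered from 1 and 0 denotes the root.
        int[] gold = {2, 0, 2};
        int[] predicted = {2, 0, 1}; // "fish" is attached to the wrong head
        System.out.println(uas(gold, predicted)); // two of the three heads are correct
    }
}
```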